Parameter Inference of Cost-Sensitive Boosting Algorithms
Abstract
Several cost-sensitive boosting algorithms have been reported as effective methods for dealing with the class imbalance problem. Misclassification costs, which reflect the different levels of importance attached to identifying each class, are integrated into the weight update formula of the AdaBoost algorithm. Yet it has been shown that the weight update parameter of AdaBoost is induced so that the training error is reduced most rapidly. This is the most crucial step of AdaBoost in converting a weak learning algorithm into a strong one. However, most reported cost-sensitive boosting algorithms ignore this property. In this paper, we propose three versions of cost-sensitive AdaBoost in which the parameters for sample weight updating are induced. Their identification abilities on the small classes are then evaluated using the F-measure on four “real world” medical data sets taken from the UCI Machine Learning Database. Our experimental results show that one of our proposed cost-sensitive AdaBoost algorithms achieves the best identification ability on the small class among all reported cost-sensitive boosting algorithms.
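To make the idea in the abstract concrete, here is a minimal Python sketch of one boosting round in which the per-sample cost multiplies the weight outside the exponent and the update parameter alpha is chosen to minimize the normalization factor, rather than being reused unchanged from plain AdaBoost. This is an illustration under stated assumptions and not necessarily any of the three variants proposed in the paper: the function name `cost_sensitive_update`, the argument names, and the toy costs are all hypothetical.

```python
import numpy as np

def cost_sensitive_update(weights, costs, y_true, y_pred):
    """One cost-sensitive boosting round (illustrative sketch, hypothetical names).

    Assumed update: D_new(i) is proportional to C_i * D(i) * exp(-alpha * y_i * h(x_i)),
    with labels and predictions in {-1, +1}.
    """
    weights = np.asarray(weights, dtype=float)
    costs = np.asarray(costs, dtype=float)
    correct = np.asarray(y_true) == np.asarray(y_pred)

    # Cost-weighted mass of correctly and incorrectly classified samples.
    cw = costs * weights
    right = cw[correct].sum()
    wrong = cw[~correct].sum()
    if right == 0 or wrong == 0:
        raise ValueError("alpha is undefined when one group has zero mass")

    # Z(alpha) = right * exp(-alpha) + wrong * exp(alpha) is minimized here.
    alpha = 0.5 * np.log(right / wrong)

    # y_i * h(x_i) is +1 for a correct prediction and -1 for a mistake.
    margin = np.where(correct, 1.0, -1.0)
    new_weights = cw * np.exp(-alpha * margin)
    return alpha, new_weights / new_weights.sum()

# Toy round: three majority samples (cost 1) and one minority sample (cost 2);
# the weak learner predicts the majority class everywhere.
alpha, w = cost_sensitive_update(
    weights=[0.25, 0.25, 0.25, 0.25],
    costs=[1, 1, 1, 2],
    y_true=[-1, -1, -1, 1],
    y_pred=[-1, -1, -1, -1],
)
```

With alpha chosen this way, the misclassified samples carry exactly half of the normalized weight after the update, which is the same balancing property that the standard AdaBoost parameter has in the cost-free case.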
Similar resources
Cost-sensitive Boosting with p-norm Loss Functions and its Applications
In practical applications of classification, there are often varying costs associated with different types of misclassification (e.g. fraud detection, anomaly detection and medical diagnosis), motivating the need for so-called “cost-sensitive” classification. In this paper, we introduce a family of novel boosting methods for cost-sensitive classification by applying the theory of gradient b...
Online Ensemble Learning for Imbalanced Data Streams
While both cost-sensitive learning and online learning have been studied extensively, the effort in simultaneously dealing with these two issues is limited. To address this challenging task, a novel learning framework is proposed in this paper. The key idea is based on the fusion of online ensemble algorithms and the state-of-the-art batch-mode cost-sensitive bagging/boosting algorithms. Within th...
Cost-sensitive Boosting for Concept Drift
Concept drift is a phenomenon typically experienced when data distributions change continuously over a period of time. In this paper we propose a cost-sensitive boosting approach for learning under concept drift. The proposed methodology estimates relevance costs of ‘old’ data samples w.r.t. ‘newer’ samples and integrates them into the boosting process. We experiment this methodology on usenet...
Cost-Sensitive Boosting for Classification of Imbalanced Data
The classification of data with imbalanced class distributions has posed a significant drawback in the performance attainable by most well-developed classification systems, which assume relatively balanced class distributions. This problem is especially crucial in many application domains, such as medical diagnosis, fraud detection, network intrusion, etc., which are of great importance in mach...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...